I am working with two datasets. The first contains cost of living information for major United States cities in 2018. It covers 132 cities and has two values for each. The first is called the ‘rent index’; this index is relative to New York City, for which all indices are 100. For example, if another city has a rent index of 120, rent in that city is on average 20% more expensive than in New York City. The other value is the cost of living, which is the
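To make the index reading concrete, here is a minimal sketch in R that converts an index value into a percent difference from New York City. The helper name `pct_vs_nyc` is hypothetical (it is not part of the original analysis), and the input values are illustrative rather than taken from the data.

```r
# Hypothetical helper: percent difference from the baseline city,
# assuming an index where New York City = 100.
pct_vs_nyc = function(index, baseline = 100) {
  100 * (index - baseline) / baseline
}

pct_vs_nyc(120)  # 20% more expensive than New York City
pct_vs_nyc(100)  # New York City itself: no difference
pct_vs_nyc(80)   # negative result: cheaper than New York City
```

Because the baseline is 100, the percent difference is simply the index minus 100; the `baseline` argument is only there to make that assumption explicit.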
Prep work on the cost_of_living data, then connecting it to latitude and longitude data for mapping.
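library(tidyr)      # separate()
library(dplyr)      # left_join()
library(openintro)  # abbr2state() (assumed here to be the openintro version)
# Drop the first column, then split "City" into city, state, and country.
cost = cost[-1]
cost = separate(cost, City, into = c("city", "state", "country"), sep = ',')
cost = cost[-3]  # drop country
# The state piece appears to keep a leading space after the comma split,
# so split on ' ' and discard the first piece.
cost = separate(cost, state, into = c("delete", "state"), sep = ' ')
cost = cost[-2]  # drop the "delete" column
cost$state = abbr2state(cost$state)  # state abbreviation -> full state name
# Keep four columns of the coordinates table, then rename state_name to state.
convert = convert[,c(1,4,9,10)]
convert$state = convert$state_name
convert = convert[-2]
# Attach coordinates to the cost data, matching on both city and state.
cost_map = left_join(cost,convert,by = c("city","state"))
str(cost_map)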
## 'data.frame': 132 obs. of 6 variables:
## $ city : chr "New York" "San Francisco" "Anchorage" "Honolulu" ...
## $ state : chr "New York" "California" "Alaska" "Hawaii" ...
## $ Cost.of.Living.Index: num 100 97.8 95 94.2 93.8 ...
## $ Rent.Index : num 100 115.4 40.1 62.8 76.2 ...
## $ lat : num 40.7 37.8 61.2 21.3 40.7 ...
## $ lng : num -73.9 -122.4 -149.1 -157.8 -73.9 ...
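Because `left_join` keeps every row of the left table even when no match is found, it can be worth checking the joined table for missing coordinates and for unexpected extra rows. The sketch below uses small made-up data frames in place of the real data; `toy_cost`, `toy_coords`, `toy_map`, and `unmatched` are hypothetical names, and all values (including the approximate coordinates) are for illustration only.

```r
library(dplyr)

# Made-up stand-ins for the cleaned cost table and the coordinates table.
toy_cost = data.frame(
  city = c("New York", "Springfield", "Nowhere"),
  state = c("New York", "Illinois", "Kansas"),
  Rent.Index = c(100, 30, 25),
  stringsAsFactors = FALSE
)
toy_coords = data.frame(
  city = c("New York", "Springfield", "Springfield"),
  state = c("New York", "Illinois", "Missouri"),
  lat = c(40.7, 39.8, 37.2),
  lng = c(-73.9, -89.6, -93.3),
  stringsAsFactors = FALSE
)

toy_map = left_join(toy_cost, toy_coords, by = c("city", "state"))

# Rows with no match in the coordinates table come back with NA lat/lng.
unmatched = filter(toy_map, is.na(lat) | is.na(lng))
unmatched$city

# If a (city, state) pair appeared more than once in the coordinates table,
# the join would return more rows than the cost table has.
nrow(toy_map) == nrow(toy_cost)
```

Joining on both `city` and `state` is what keeps the two Springfields apart in this toy example. In the `str` output above, `cost_map` has 132 rows, the same as the number of cities, which is consistent with no duplicated matches; the same `is.na` check could be run on `cost_map` to look for cities that did not receive coordinates.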